An Efficient Parallel Algorithm for High Dimensional Similarity Join - Parallel Processing Symposium, 1998, and Symposium on Parallel and Distributed Processing 1998. 19

Authors

  • Khaled Alsabti
  • Sanjay Ranka
  • Vineet Singh
Abstract

Multidimensional similarity join finds pairs of multidimensional points that are within some small distance of each other. The ε-k-d-B tree has been proposed as a data structure that scales better than previous data structures as the number of dimensions increases. We present a cost model of the ε-k-d-B tree and use it to optimize the leaf size. We present novel parallel algorithms for the similarity join using the ε-k-d-B tree. A load-balancing strategy based on equi-depth histograms is shown to work well for uniform or low-skew situations, whereas another based on weighted equi-depth histograms works far better for high-skew datasets. The latter strategy is only slightly slower than the former strategy for low-skew datasets. Further, its cost is proportional to the overall cost of the similarity join.
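As a rough illustration of two ideas named in the abstract, the sketch below shows a brute-force similarity join and a plain equi-depth split of one coordinate. It is not the authors' parallel algorithm and does not build an ε-k-d-B tree; the function names (`similarity_join`, `equi_depth_boundaries`, `assign_bucket`) and the choice of Euclidean distance are assumptions made only for this example.

```python
from bisect import bisect_right
from itertools import combinations
from math import dist


def similarity_join(points, eps):
    """Brute-force baseline: index pairs (i, j), i < j, whose points lie
    within Euclidean distance eps (the metric is an assumption here)."""
    return [
        (i, j)
        for (i, p), (j, q) in combinations(enumerate(points), 2)
        if dist(p, q) <= eps
    ]


def equi_depth_boundaries(values, num_buckets):
    """Pick num_buckets - 1 split values so that each bucket receives
    roughly the same number of input values."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(k * n) // num_buckets] for k in range(1, num_buckets)]


def assign_bucket(value, boundaries):
    """Map a value to its bucket index, given sorted split values."""
    return bisect_right(boundaries, value)


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.0, 1.05)]
    print(similarity_join(pts, 0.2))

    first_coord = [1, 2, 3, 4, 5, 6, 7, 8]
    bounds = equi_depth_boundaries(first_coord, 4)
    print(bounds, [assign_bucket(v, bounds) for v in first_coord])
```

In a parallel setting, each bucket could be handed to one processor so that every processor starts with about the same number of points. With heavily skewed data, equal point counts do not imply equal join work, which is the situation the abstract's weighted equi-depth histograms are meant to address.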


Similar resources

An Efficient RMS Admission Control and its Application to Multiprocessor Scheduling - Parallel Processing Symposium, 1998, and Symposium on Parallel and Distributed Processing 1998. 19

A real-time system must execute functionally correct computations in a timely manner. In order to guarantee that all tasks accepted in the system will meet their timing requirements, an admission control algorithm must be used to only accept tasks whose requirements can be satisfied. Rate-monotonic scheduling (RMS) is arguably the best known scheduling policy for periodic real-time tasks on uni...


Experimental Validation of Parallel Computation Models on the Intel Paragon - Parallel Processing Symposium, 1998, and Symposium on Parallel and Distributed Processing 1998. 19

Experimental data validating some of the proposed parallel computation models on the Intel Paragon is presented. This architecture is characterized by a large bandwidth and a relatively large startup cost of a message transmission, which makes it extremely important to employ bulk transfers. The models considered are the BSP model, in which it is assumed that all messages have a fixed short siz...


Solving the Problem of Scheduling Unrelated Parallel Machines with Limited Access to Jobs

Nowadays, with the successful application of the on-time production concept in areas such as production management and storage, completing the processing of jobs by their delivery times is considered a key issue in industrial environments. Unrelated parallel machine scheduling is a general case of the classic parallel machine problems. In some of the applications of unrelated parallel mac...



High Performance Implementation of Fuzzy C-Means and Watershed Algorithms for MRI Segmentation

Image segmentation is one of the most common steps in digital image processing. There are many image segmentation algorithms (e.g., thresholding, edge detection, and region growing) employed for classifying a digital image into different segments. In this connection, finding a suitable algorithm for medical image segmentation is a challenging task, mainly due to the noise, low contrast, and steep...




Publication date: 1998